How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets
Authors
Abstract
In this paper, we investigate how to scale up kernel methods to take on large-scale problems on which deep neural networks have been prevailing. To this end, we leverage existing techniques and develop new ones. These techniques include approximating kernel functions with features derived from random projections, parallel training of kernel models with 100 million parameters or more, and new schemes for combining kernel functions as a way of learning representations. We demonstrate how to combine these ideas to implement large-scale kernel machines for challenging problems in automatic speech recognition. We validate our approaches with extensive empirical studies on real-world speech datasets on the task of acoustic modeling. We show that our kernel models are as competitive as well-engineered deep neural networks (DNNs). In particular, kernel models either attain similar performance to, or surpass, their DNN counterparts. Our work thus avails more tools to machine learning researchers for addressing large-scale learning problems.
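The random-projection features mentioned in the abstract are commonly instantiated as random Fourier features (Rahimi & Recht). Below is a minimal sketch of that construction for the RBF kernel; the function name and parameter choices are ours for illustration, not the paper's implementation:

```python
import numpy as np

def random_fourier_features(X, n_features=1000, gamma=1.0, seed=0):
    """Map X of shape (n_samples, d) to features z(X) such that
    z(x) . z(y) approximates exp(-gamma * ||x - y||^2), the RBF kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density: N(0, 2*gamma*I)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

A linear model trained on these features then behaves like an approximate kernel machine, which is what makes parallel training of models with 100 million parameters or more tractable.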
Similar resources
Kernel Methods for Deep Learning
We introduce a new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets. These kernel functions can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that we call multilayer kernel machines (MKMs). We evaluate SVMs and MKMs with these kernel functions on problems designed to illustr...
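The kernels in this line of work are the arc-cosine kernels of Cho & Saul, which correspond to infinitely wide hidden layers. A short sketch of the degree-1 case (the function name is ours); note that on a single input it recovers the squared norm, k(x, x) = ||x||²:

```python
import numpy as np

def arc_cosine_kernel_deg1(x, y):
    """Degree-1 arc-cosine kernel: (1/pi) * ||x|| * ||y|| *
    (sin(theta) + (pi - theta) * cos(theta)), where theta is the
    angle between x and y. Mimics an infinite layer of ReLU units."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_t = np.clip(x @ y / (nx * ny), -1.0, 1.0)  # guard rounding error
    theta = np.arccos(cos_t)
    return (nx * ny / np.pi) * (np.sin(theta) + (np.pi - theta) * cos_t)
```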
Scalable Kernel Methods via Doubly Stochastic Gradients
The general perception is that kernel methods are not scalable, so neural nets become the choice for large-scale nonlinear learning problems. Have we tried hard enough for kernel methods? In this paper, we propose an approach that scales up kernel methods using a novel concept called “doubly stochastic functional gradients”. Based on the fact that many kernel methods can be expressed as convex ...
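The "doubly stochastic" idea is that each gradient step samples both a random data point and a random feature. Below is a highly simplified sketch for squared loss with random Fourier features; all names, step sizes, and simplifications are ours, not the paper's algorithm:

```python
import numpy as np

def doubly_stochastic_fit(X, y, gamma=1.0, n_steps=300, step=0.2, reg=1e-4, seed=0):
    """Each step samples one data point AND one random feature, so both
    the data expectation and the kernel expectation are stochastic."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.empty((n_steps, d))   # sampled frequencies
    b = np.empty(n_steps)        # sampled phases
    a = np.zeros(n_steps)        # per-feature coefficients

    def predict(x, t):
        if t == 0:
            return 0.0
        phi = np.sqrt(2.0) * np.cos(W[:t] @ x + b[:t])
        return float(phi @ a[:t])

    for t in range(n_steps):
        i = rng.integers(n)                                   # random data point
        W[t] = rng.normal(scale=np.sqrt(2.0 * gamma), size=d)  # random feature
        b[t] = rng.uniform(0.0, 2.0 * np.pi)
        eta = step / np.sqrt(t + 1)
        err = predict(X[i], t) - y[i]        # squared-loss residual
        a[:t] *= 1.0 - eta * reg             # shrink old coefficients (regularizer)
        a[t] = -eta * err * np.sqrt(2.0) * np.cos(W[t] @ X[i] + b[t])

    return lambda x: predict(x, n_steps)
```

Memory grows only with the number of steps, not with the dataset, which is the source of the method's scalability.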
Deep Semi-Random Features for Nonlinear Function Approximation
We propose semi-random features for nonlinear function approximation. The flexibility of semi-random features lies between the fully adjustable units in deep learning and the random features used in kernel methods. For one hidden layer models with semi-random features, we prove with no unrealistic assumptions that the model classes contain an arbitrarily good function as the width increases (univ...
Learning Multi-Scale Deep Features for High-Resolution Satellite Image Classification
In this paper, we propose a multi-scale deep feature learning method for high-resolution satellite image classification. Specifically, we first warp the original satellite image into multiple different scales. The images at each scale are employed to train a deep convolutional neural network (DCNN). However, simultaneously training multiple DCNNs is time-consuming. To address this issue, we e...
Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery
Extracting buildings from mono optical aerial imagery with high spatial resolution is widely considered an important challenge in map production. The goal of the current research is to take advantage of semantic segmentation of mono optical aerial imagery to extract buildings, realized through a combination of deep convolutional neural networks (DCNN) an...
Journal: CoRR
Volume: abs/1411.4000
Pages: -
Publication year: 2014